Interpretable Projection Pursuit
Authors
Abstract
The goal of this thesis is to modify projection pursuit by trading accuracy for interpretability. The modification produces a more parsimonious and understandable model without sacrificing the structure that projection pursuit seeks. The method retains the nonlinear versatility of projection pursuit while clarifying the results. Following an introduction that outlines the dissertation, the first and second chapters present the technique as applied to exploratory projection pursuit and projection pursuit regression, respectively. The interpretability of a description is measured as the simplicity of the coefficients that define its linear projections. Several interpretability indices for a set of vectors are defined, based on the ideas of rotation in factor analysis and entropy. The two methods require slightly different indices because of their contrary goals. A roughness penalty weighting approach is used to search for a more parsimonious description, with interpretability replacing smoothness. The computational algorithms for both interpretable exploratory projection pursuit and interpretable projection pursuit regression are described. In the former case, a rotationally invariant projection index is needed and defined. In the latter, alterations to the original algorithm are required. Examples using real data are considered in each situation. The third chapter deals with the connections between the proposed modification and other ideas that seek to produce more interpretable models. The modification as applied to linear regression is shown to be analogous to a nonlinear continuous method of variable selection. It is compared with other variable selection techniques and is analyzed in a Bayesian context. Possible extensions to other data analysis methods are cited and avenues for future research are identified. The conclusion addresses the issue of sacrificing accuracy for parsimony in general. An example of calculating the tradeoff between accuracy and interpretability due to a common simplifying action, namely rounding the binwidth for a histogram, illustrates the applicability of the approach.
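The abstract does not give formulas for the interpretability indices or for the penalty weighting, so the following Python sketch is only an illustration of the general idea. It assumes a varimax-style index (the scaled variance of the squared coefficients of a unit vector) and an entropy-style index, both normalized to lie between 0 and 1, and a convex combination of a fit criterion and the simplicity index controlled by a hypothetical weight `lam`. The function names and the normalizations are assumptions made for this example, not definitions taken from the thesis.

```python
import math


def unit(v):
    """Scale a coefficient vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("coefficient vector must be nonzero")
    return [x / norm for x in v]


def varimax_simplicity(v):
    """Assumed varimax-style index: 0 for equal weights, 1 for a single nonzero coefficient."""
    p = len(v)
    if p < 2:
        raise ValueError("need at least two coefficients")
    w = [x * x for x in unit(v)]  # squared coefficients sum to 1
    return p / (p - 1) * sum((wi - 1 / p) ** 2 for wi in w)


def entropy_simplicity(v):
    """Assumed entropy-style index: 1 minus the normalized entropy of the squared coefficients."""
    p = len(v)
    if p < 2:
        raise ValueError("need at least two coefficients")
    w = [x * x for x in unit(v)]
    h = -sum(wi * math.log(wi) for wi in w if wi > 0)
    return 1 - h / math.log(p)


def penalized_objective(fit, simplicity, lam):
    """Hypothetical weighting: lam = 0 ignores simplicity, lam = 1 ignores fit."""
    if not 0 <= lam <= 1:
        raise ValueError("lam must lie in [0, 1]")
    return (1 - lam) * fit + lam * simplicity


if __name__ == "__main__":
    for beta in ([0.0, 0.0, 1.0], [1.0, 1.0, 1.0], [0.9, 0.1, 0.4]):
        print(beta, varimax_simplicity(beta), entropy_simplicity(beta))
```

Under these assumptions, a projection that loads on one variable scores as maximally simple and one that weights all variables equally scores as minimally simple; increasing `lam` shifts the search toward the former at some cost in fit.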
Similar resources
Functional Projection Pursuit
This article describes the adaptation of exploratory projection pursuit for use with functional data. The aim is to find "interesting" projections of functional data, e.g. to separate curves into meaningful clusters. Functional data are projected onto low-dimensional subspaces determined by a projection function using a suitable inner product. Such a projection is rapidly computed by representing d...
Performing a Preprocessing Step Before the Feature Extraction Stage in Hyperspectral Image Data Classification
Hyperspectral data potentially contain more information than multispectral data because of their higher spectral resolution. However, the stochastic data analysis approaches that have been successfully applied to multispectral data are not as effective for hyperspectral data. Various investigations indicate that the key problem that causes poor performance in the stochastic approaches t...
Sparse Coding for Learning Interpretable Spatio-Temporal Primitives
Sparse coding has recently become a popular approach in computer vision to learn dictionaries of natural images. In this paper we extend the sparse coding framework to learn interpretable spatio-temporal primitives. We formulated the problem as a tensor factorization problem with tensor group norm constraints over the primitives, diagonal constraints on the activations that provide interpretabi...
A Note on Projection Pursuit
I provide a historic review of the forward and backward projection pursuit algorithms, previously thought to be equivalent, and point out an important difference between the two. In doing so, I correct a small error in the original exploratory projection pursuit paper (Friedman 1987). The implication of the difference is briefly discussed in the context of an application in which projection pur...